71 research outputs found

    Improved algorithms for topic distillation in a hyperlinked environment

    This paper addresses the problem of topic distillation on the World Wide Web, namely, given a typical user query, finding quality documents related to the query topic. Connectivity analysis has been shown to be useful in identifying high-quality pages within a topic-specific graph of hyperlinked documents. The essence of our approach is to augment a previous connectivity-analysis-based algorithm with content analysis. We identify three problems with the existing approach and devise algorithms to tackle them. The results of a user evaluation are reported that show an improvement of precision at 10 documents by at least 45% over pure connectivity analysis.

    Fully dynamic cycle-equivalence in graphs

    Two edges e_1 and e_2 of an undirected graph are cycle-equivalent iff all cycles that contain e_1 also contain e_2, i.e., iff e_1 and e_2 are a cut-edge pair. The cycle-equivalence classes of the control-flow graph are used in optimizing compilers to speed up existing control-flow and data-flow algorithms. While the cycle-equivalence classes can be computed in linear time, we present the first fully dynamic algorithm for maintaining the cycle-equivalence relation. In an n-node graph our data structure executes an edge insertion or deletion in O(sqrt(n log n)) time and answers the query whether two given edges are cycle-equivalent in O(log^2 n) time. We also present an algorithm for plane graphs with O(log n) update and query time and for planar graphs with O(log n) insertion time and O(log^2 n) query and deletion time. Additionally, we show a lower bound of Ω(log n / log log n) for the amortized time per operation for the dynamic cycle-equivalence problem in the cell probe model.
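
The definition of cycle-equivalence above can be checked statically by brute force. The following is a minimal sketch, not the paper's dynamic data structure: it assumes simple cycles, applies the condition in both directions, and uses the hypothetical helper names `connected`, `every_cycle_through`, and `cycle_equivalent`. It relies on the fact that an edge (u, v) lies on a cycle avoiding another edge exactly when u and v stay connected after both edges are removed.

```python
from collections import deque


def connected(n, edges, u, v):
    """Return True if u and v are connected in the undirected graph."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


def every_cycle_through(n, edges, i, j):
    """True if every simple cycle containing edges[i] also contains edges[j]."""
    if i == j:
        return True
    # Remove both edges; if the endpoints of edges[i] are still connected,
    # edges[i] lies on a cycle that avoids edges[j].
    rest = [e for k, e in enumerate(edges) if k not in (i, j)]
    u, v = edges[i]
    return not connected(n, rest, u, v)


def cycle_equivalent(n, edges, i, j):
    """Symmetric check (an assumption of this sketch), O(n + m) per query."""
    return (every_cycle_through(n, edges, i, j)
            and every_cycle_through(n, edges, j, i))


# Two triangles sharing vertex 0.
triangles = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]
same = cycle_equivalent(5, triangles, 0, 1)       # edges of the same triangle
different = cycle_equivalent(5, triangles, 0, 3)  # edges of different triangles
```

Each query here costs a full graph traversal, which is the cost the dynamic data structure described in the abstract is designed to avoid.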

    Hyperlink analysis for the Web

    Hyperlink analysis algorithms significantly improve the relevance of the search results on the Web, so much so that all major Web search engines claim to use some type of hyperlink analysis. However, the search engines do not disclose details about the type of hyperlink analysis they perform, mostly to avoid manipulation of search results by Web-positioning companies. The article discusses how hyperlink analysis can be applied to ranking algorithms, and surveys other ways Web search engines can use this analysis.

    Improved data structures for fully dynamic biconnectivity

    We present fully dynamic algorithms for maintaining the biconnected components in general and plane graphs. A fully dynamic algorithm maintains a graph during a sequence of insertions and deletions of edges or isolated vertices. Let m be the number of edges and n be the number of vertices in a graph. The time per operation of the best deterministic algorithms is O(sqrt(n)) in general graphs and O(log n) in plane graphs for fully dynamic connectivity, and O(min{m^{2/3}, n}) in general graphs and O(sqrt(n)) in plane graphs for fully dynamic biconnectivity. We improve the latter running times to O(sqrt(m log n)) in general graphs and O(log^2 n) in plane graphs. Our algorithm for general graphs can also find the biconnected components of all vertices in time O(n).

    Combinatorial algorithms for web search engines: three success stories

    How much can smart combinatorial algorithms improve web search engines? To address this question we will describe three algorithms that have had a positive impact on web search engines: the PageRank algorithm, algorithms for finding near-duplicate web pages, and algorithms for index server load balancing.

    Fine-Grained Complexity Lower Bounds for Families of Dynamic Graphs

    A dynamic graph algorithm is a data structure that answers queries about a property of the current graph while supporting graph modifications such as edge insertions and deletions. Prior work has shown strong conditional lower bounds for general dynamic graphs, yet graph families that arise in practice often exhibit structural properties that the existing lower bound constructions do not possess. We study three specific graph families that are ubiquitous, namely constant-degree graphs, power-law graphs, and expander graphs, and give the first conditional lower bounds for them. Our results show that even when restricting our attention to one of these graph classes, any algorithm for fundamental graph problems such as distance computation or approximation or maximum matching cannot simultaneously achieve a sub-polynomial update time and query time. For example, we show that the same lower bounds as for general graphs hold for maximum matching and (s,t)-distance in constant-degree graphs, power-law graphs or expanders. Namely, in an m-edge graph, there exists no dynamic algorithm with both O(m^{1/2 - ε}) update time and O(m^{1 - ε}) query time, for any small ε > 0. Note that for (s,t)-distance the trivial dynamic algorithm achieves an almost matching upper bound of constant update time and O(m) query time. We prove similar bounds for the other graph families and for other fundamental problems such as densest subgraph detection and perfect matching.
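
The "trivial dynamic algorithm" for (s,t)-distance mentioned above can be sketched as follows. This is an illustration under stated assumptions, not code from the paper: the graph is taken to be undirected and unweighted, the class name `TrivialDynamicDistance` and its methods are hypothetical, and updates are constant time only in the expected sense that hash-set operations provide. Updates just edit adjacency sets; every query runs a fresh breadth-first search, which takes O(m) time.

```python
from collections import deque


class TrivialDynamicDistance:
    """Store the edges; recompute the distance from scratch on every query."""

    def __init__(self):
        self.adj = {}  # vertex -> set of neighbours

    def insert_edge(self, u, v):
        # Expected O(1): two hash-set insertions.
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    def delete_edge(self, u, v):
        # Expected O(1): two hash-set removals (no error if the edge is absent).
        self.adj.get(u, set()).discard(v)
        self.adj.get(v, set()).discard(u)

    def distance(self, s, t):
        """Breadth-first search from s; returns None if t is unreachable."""
        if s == t:
            return 0
        dist = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in self.adj.get(x, ()):
                if y not in dist:
                    dist[y] = dist[x] + 1
                    if y == t:
                        return dist[y]
                    queue.append(y)
        return None


g = TrivialDynamicDistance()
for u, v in [(0, 1), (1, 2), (2, 3)]:
    g.insert_edge(u, v)
d_path = g.distance(0, 3)      # path 0-1-2-3
g.insert_edge(0, 3)
d_shortcut = g.distance(0, 3)  # direct edge 0-3
g.delete_edge(0, 3)
g.delete_edge(1, 2)
d_cut = g.distance(0, 3)       # 0 and 3 are now disconnected
```

The lower bound stated in the abstract says that, under the paper's conditional assumptions, this update/query trade-off cannot be improved polynomially on both sides at once, even for the restricted graph families.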

    Finding related pages in the World Wide Web

    When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's "What's Related" service (http://home.netscape.com/escapes/related/). Our study showed that the precision at 10 for our two algorithms is 73% better and 51% better than that of Netscape, despite the fact that Netscape uses both content and usage pattern information in addition to connectivity information.

    Maintaining minimum spanning forests in dynamic graphs

    We present the first fully dynamic algorithm for maintaining a minimum spanning forest in time o(sqrt(n)) per operation. To be precise, the algorithm uses O(n^{1/3} log n) amortized time per update operation. The algorithm is fairly simple and deterministic. An immediate consequence is the first fully dynamic deterministic algorithm for maintaining connectivity and bipartiteness in amortized time O(n^{1/3} log n) per update, with O(1) worst-case time per query.

    Scheduling multicasts on unit-capacity trees and meshes

    This paper studies the multicast routing and admission control problem on unit-capacity tree and mesh topologies in the throughput model. The problem is a generalization of the edge-disjoint paths problem and is NP-hard both on trees and meshes. We study both the offline and the online version of the problem: In the offline setting, we give the first constant-factor approximation algorithm for trees, and an O((log log n)^2)-factor approximation algorithm for meshes. In the online setting, we give the first polylogarithmic-competitive online algorithm for tree and mesh topologies. No polylogarithmic-competitive algorithm is possible on general network topologies (Lower bounds for on-line graph problems with application to on-line circuits and optical routing, in: Proceedings of the 28th ACM Symposium on Theory of Computing, 1996, pp. 531-540) and there exists a polylogarithmic lower bound on the competitive ratio of any online algorithm on tree topologies (Making commitments in the face of uncertainty: how to pick a winner almost every time, in: Proceedings of the 28th Annual ACM Symposium on Theory of Computing, 1996, pp. 519-530). We prove the same lower bound for meshes.